95 research outputs found

    ANNOTATED DISJUNCT FOR MACHINE TRANSLATION

    Most information on the Internet is available in English, yet most people in the world are not English speakers, so a reliable Machine Translation (MT) tool would be of great benefit to them. There are several approaches to developing MT systems, including direct, rule-based/transfer, interlingua, and statistical approaches. This thesis focuses on developing an MT system for less-resourced languages, i.e. languages that lack an available grammar formalism, parser, and corpus, such as some languages in South East Asia. The absence of bilingual corpora motivates the use of direct or transfer approaches. Moreover, the unavailability of a grammar formalism and parser for the target languages motivates a hybrid of the direct and transfer approaches, referred to as a hybrid transfer approach. This approach uses the Annotated Disjunct (ADJ) method, which is based on the Link Grammar (LG) formalism and can theoretically handle one-to-one, many-to-one, and many-to-many word translations. The method consists of a transfer rules module that maps source words in a source sentence (SS) to target words in the correct position in a target sentence (TS). The developed transfer rules are demonstrated on English → Indonesian translation tasks. An experimental evaluation compares the performance of the developed system with available English-Indonesian MT systems. The developed ADJ-based MT system translated simple, compound, and complex English sentences in present, present continuous, present perfect, past, past perfect, and future tenses with better precision than other systems, with an accuracy of 71.17% on the Subjective Sentence Error Rate metric.

    Computer Program Software for Determining Formal Symmetry of Evolution Equations

    The existence of a formal symmetry of an evolution equation is one of the criteria for the complete integrability or solvability of evolution equations, due to Sokolov and Shabat. Many evolution equations, such as the Korteweg-de Vries (KdV) soliton equation, have recently been found to have various kinds of explicit integrals or solutions. Such evolution equations admit infinitely many symmetries or admit a recursion operator. In this paper we introduce the definition of formal symmetry. A formal symmetry is an approximation of the recursion operator, which gives a convenient way of characterizing equations that admit infinitely many symmetries. In this research, we developed a program for computing the formal symmetries of evolution equations. To verify the correctness of the program, we applied it to several test evolution equations that have been proved to be formally completely integrable. The program can compute formal symmetries of arbitrary finite order (up to order 18) for the test equations, which verifies its correctness.

    Arsitektur Jaringan Neural Berbasis Simpul Ram Untuk Pengenalan Huruf

    Every handwritten letter differs depending on who writes it. Similarly, letters printed from a computer differ depending on the selected font and the type of printer. A method for recognizing letters is therefore needed, and neural networks are one such method. Examples of neural network methods are Adaline, Madaline, and Backpropagation. Their disadvantage is that they have interconnection weights that require many iterations to train, so the computation time is long. In this study, a neural network based on RAM nodes is used, which has a considerably shorter computation time because it doesn't involve weight vectors in its process. With an input letter pattern given as a 64 x 48 pixel binary image, and using Turbo C++ version 1.0, we obtain a recognition time of less than 2 seconds. Had another method been used, for example Backpropagation, it could have taken on the order of minutes or even hours.
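
    The abstract does not describe the network's architecture, so the following is only a minimal sketch of how a weightless, RAM-node classifier can work, assuming the classic n-tuple (WiSARD-style) scheme. The class names, the tuple size, and the random input mapping are all illustrative assumptions rather than details from the study. Training writes addresses into lookup tables instead of iterating over weights, which is why it needs a single pass.

```python
import random


class RamDiscriminator:
    """One class's bank of RAM nodes (assumed WiSARD-style layout)."""

    def __init__(self, n_bits, tuple_size, seed=0):
        order = list(range(n_bits))
        random.Random(seed).shuffle(order)  # fixed pseudo-random input mapping
        # Each RAM node watches `tuple_size` input bits.
        self.tuples = [order[i:i + tuple_size]
                       for i in range(0, n_bits, tuple_size)]
        # A set per node stands in for the RAM: it holds the addresses written.
        self.rams = [set() for _ in self.tuples]

    def _addresses(self, bits):
        return [tuple(bits[i] for i in idx) for idx in self.tuples]

    def train(self, bits):
        # Single pass: store the address seen by every node, no weights.
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def score(self, bits):
        # Number of nodes that recognise their part of the pattern.
        return sum(addr in ram
                   for ram, addr in zip(self.rams, self._addresses(bits)))


class RamClassifier:
    """One discriminator per label; the highest score wins."""

    def __init__(self, n_bits, tuple_size=4, seed=0):
        self.n_bits, self.tuple_size, self.seed = n_bits, tuple_size, seed
        self.discriminators = {}

    def train(self, label, bits):
        if label not in self.discriminators:
            self.discriminators[label] = RamDiscriminator(
                self.n_bits, self.tuple_size, self.seed)
        self.discriminators[label].train(bits)

    def predict(self, bits):
        return max(self.discriminators,
                   key=lambda label: self.discriminators[label].score(bits))
```

    A 64 x 48 binary image as in the abstract would be flattened to a list of 3072 bits and passed as `bits`; the sketch itself works for any bit-vector length.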

    Point of Interest (POI) Recommendation System using Implicit Feedback Based on K-Means+ Clustering and User-Based Collaborative Filtering

    Recommendation systems always involve huge volumes of data, which causes scalability issues that not only increase processing time but also reduce accuracy. In addition, the type of data used greatly affects the resulting recommendations. Recommendation systems commonly use two types of data: implicit (binary) ratings and explicit (scalar) ratings. Binary ratings produce lower accuracy when they are not handled properly. Thus, optimized K-Means+ clustering and user-based collaborative filtering are proposed in this research. The K-Means clustering is optimized by selecting the K value using the Davies-Bouldin Index (DBI) method. The experimental results show that this optimization of the K value produces better clustering than the Elbow Method. K-Means+ with User-Based Collaborative Filtering (UBCF) produces a precision of 8.6% and an F-measure of 7.2%. The proposed method was compared with the DBSCAN algorithm with UBCF and achieved better accuracy, with a 1% increase in precision. This result shows that K-Means+ with UBCF can handle implicit feedback datasets and improve precision.
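
    Selecting K with the Davies-Bouldin Index can be sketched as follows: cluster the data for each candidate K and keep the K with the lowest index. This is an illustrative sketch, not the paper's K-Means+ procedure, whose details the abstract does not give. The function names are hypothetical, and the deterministic farthest-point seeding is an assumption made only to keep the example reproducible.

```python
import math


def farthest_point_init(points, k):
    """Assumed seeding: start at points[0], then repeatedly add the point
    farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = assign(points, centroids)
    for _ in range(iters):
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


def davies_bouldin(points, centroids, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / distance_ij.
    Lower is better. Assumes no two centroids coincide."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        scatter.append(sum(math.dist(p, centroids[c]) for p in members) / len(members)
                       if members else 0.0)
    return sum(
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)) / k


def choose_k(points, candidates):
    """Return the candidate K with the lowest DBI, plus all scores."""
    scores = {k: davies_bouldin(points, *kmeans(points, k)) for k in candidates}
    return min(scores, key=scores.get), scores
```

    Calling `choose_k(points, range(2, 6))` on a list of coordinate tuples returns the selected K together with the DBI of every candidate, so the scores can be inspected alongside an elbow plot.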

    Impact of Matrix Factorization and Regularization Hyperparameter on a Recommender System for Movies

    Recommendation systems are developed to match consumers with products that meet their varied needs and tastes in order to enhance user satisfaction and loyalty. Personalized recommendation systems have grown in popularity in recent years and are applied in several areas, including movies, songs, books, news, friend recommendations on social media, travel products, and other products in general. Collaborative filtering methods are widely used in recommendation systems and are divided into neighborhood-based and model-based methods. In this study, we implement matrix factorization, a model-based method that learns latent factors for each user and item and uses them to make rating predictions. The method is trained using stochastic gradient descent with additional tricks and optimization of the regularization hyperparameter. Finally, neighborhood-based collaborative filtering and matrix factorization with different values of the regularization hyperparameter are compared. Our results show that matrix factorization with the lowest regularization hyperparameter outperformed the other methods in terms of RMSE. The functions used in this study are available from GraphLab, and the MovieLens 100k data set is used to build the recommendation systems.
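
    To make the role of the regularization hyperparameter concrete, here is a minimal pure-Python sketch of matrix factorization trained with stochastic gradient descent. It does not use GraphLab and is not the study's implementation. The function names, learning rate, factor count, and default `reg` value are assumptions chosen for illustration.

```python
import math
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def train_mf(ratings, n_users, n_items, n_factors=2,
             lr=0.01, reg=0.02, epochs=300, seed=0):
    """ratings: list of (user_index, item_index, rating) triples.
    `reg` is the L2 regularization hyperparameter."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the ratings in a new order each epoch
        for u, i, r in data:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error plus the L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error of the predictions on `ratings`."""
    squared = [(r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))
```

    Training the same data with several `reg` values and comparing `rmse` on held-out ratings mirrors the comparison described above. A larger `reg` shrinks the factors toward zero, trading fit for smoothness.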

    Perbandingan Performa Relational, Document-Oriented dan Graph Database Pada Struktur Data Directed Acyclic Graph

    Abstract. A Directed Acyclic Graph (DAG) is a directed graph with no cycles, commonly found in social network and genealogy data. Each database type performs differently depending on the data structure it handles, so a suitable database type for DAG data should be evaluated and chosen as a platform. This study compares the performance of a relational database (PostgreSQL), a document-oriented database (MongoDB), and a graph database (Neo4j) on a DAG dataset. The performance test is run from Node.js on Windows 10 and uses a dataset of 3910 nodes with single write synchronous (SWS) and single read (SR) operations. The access time of PostgreSQL is 0.64 ms for SWS and 0.32 ms for SR, MongoDB is 0.64 ms for SWS and 4.59 ms for SR, and Neo4j is 9.92 ms for SWS and 8.92 ms for SR. Hence, the relational database (PostgreSQL) performs best in the SWS and SR operations compared with the document-oriented database (MongoDB) and the graph database (Neo4j).
    Keywords: database performance, directed acyclic graph, relational database, document-oriented database, graph database

    Stemming Influence on Similarity Detection of Abstract Written in Indonesia

    In this paper we discuss the effect of stemming with the Nazief and Adriani algorithm on similarity detection results for abstracts written in Indonesian. Detecting similarity between the contents of publication abstracts can serve as an early indication of whether a piece of writing involves plagiarism. Text processing usually includes a pre-processing stage, one step of which is stemming: reducing each word to its root word in order to maximize the search process. The output of the stemming process is converted into a set of word n-grams, and similarity between texts is then analyzed using Fingerprint Matching. Based on the F1-score, which balances precision and recall, detection that applies both stemming and stopword removal gives the best result in detecting similarity between texts, with an average of 42%. This is higher than similarity detection using only stemming (31%) or detection without any text pre-processing (34%) when bigrams are applied.
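
    The pipeline of pre-processing, word n-grams, and fingerprint comparison can be sketched roughly as below. Several parts are assumptions made for illustration: the stopword list is a tiny hypothetical sample, `stem` is an identity placeholder where a Nazief and Adriani stemmer would go, MD5 is used only as a convenient stable hash, and Jaccard overlap stands in for the similarity measure, which the abstract does not specify.

```python
import hashlib

# Tiny illustrative stopword sample, not the list used in the paper.
STOPWORDS = {"yang", "dan", "di", "ke", "dari"}


def stem(word):
    """Placeholder: a real pipeline would apply the Nazief and Adriani
    stemmer here; this sketch returns the word unchanged."""
    return word


def fingerprints(text, n=2, remove_stopwords=True):
    """Hash every word n-gram of the pre-processed text into a set."""
    tokens = [stem(w) for w in text.lower().split()]
    if remove_stopwords:
        tokens = [w for w in tokens if w not in STOPWORDS]
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {hashlib.md5(g.encode("utf-8")).hexdigest() for g in grams}


def similarity(a, b, n=2, remove_stopwords=True):
    """Jaccard overlap of the two fingerprint sets (assumed measure)."""
    fa = fingerprints(a, n, remove_stopwords)
    fb = fingerprints(b, n, remove_stopwords)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

    Toggling `remove_stopwords` and swapping in a real stemmer for `stem` reproduces the kind of with/without pre-processing comparison the abstract reports, with `n=2` corresponding to bigrams.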

    User Curiosity Factor in Determining Serendipity of Recommender System

    A recommender system (RS) recommends, from a huge selection of items, those that will be useful to e-commerce users. An RS prevents users from being flooded with information that is irrelevant to them. Unlike information retrieval (IR) systems, an RS aims to present information that is accurate and preferably useful to the users. Too much focus on accuracy in an RS may lead to an overspecialization problem, which decreases its effectiveness. Therefore, RS research is trending toward beyond-accuracy methods, such as serendipity. Serendipity can be described as an unexpected discovery that is useful. Since the concept of a recommendation system is still evolving, formalizing the definition of serendipity in a recommendation system is very challenging. One known subjective factor of serendipity is curiosity. While some researchers have already addressed the curiosity factor, the relationships between the various serendipity components as perceived by users and their curiosity levels have yet to be researched. In this paper, a method for determining a user curiosity model by considering the variation of rated items is presented, and its relation to serendipity components is then validated using existing user feedback data. The findings show that the curiosity model is related to some, but not all, user-perceived values of serendipity. Moreover, it also has a positive effect on broadening user preferences.

    Performance Improvement Using CNN for Sentiment Analysis

    Deep Learning methods provide great results in various fields, especially in Sentiment Analysis. One Deep Learning method is the CNN, which has achieved high accuracy in previous research. However, some parts of the training process can be improved to raise accuracy and reduce training time. In this paper, we try to improve the accuracy and processing time of sentiment analysis using a CNN model. By tuning the filter size, frameworks, and pre-training, the results show that using a smaller filter size and pre-trained word2vec provides greater accuracy than some previous studies.